Abstract

We study statistical properties of an NP-complete problem, the subset sum, using 
the methods and concepts of statistical mechanics. The problem is a generalization 
of the number partitioning problem, which is also an NP-complete problem and has 
been studied in the physics literature. The asymptotic expressions for the number of 
solutions are obtained. These results applied to the number partitioning problem as 
a special case are compared with those which were previously obtained by a different 
method. We discuss the limit of applicability of the techniques of statistical mechanics 
to the present problem. 

1 Introduction 

The methods and concepts of statistical mechanics have turned out to be quite useful in
the study of problems in computer science and related fields. In particular, the techniques
originally developed in spin glass theory have been successfully applied
to the investigation of the properties of NP-complete problems in the theory of computational
complexity. Examples include the travelling salesman, graph partitioning,
K-SAT, knapsack, vertex cover and other problems. Roughly
speaking, NP-complete problems are a class of problems which are difficult to solve, in
the sense that so far no one has succeeded in devising (and in fact it is believed to be
impossible to design) an algorithm to determine in polynomial time whether or not there
is a solution to given input data. NP-complete problems have been extensively studied,
but still pose many open questions.

The issue of primary interest to computer scientists is to find an algorithm which 
efficiently finds a solution to given input data, for which purpose statistical mechanics 
may not be of direct use because the latter is suitable to reveal typical properties of 
many-body systems. Recently, however, statistical properties of these problems have been 
receiving increasing attention since it has gradually been recognized that a generally hard 
problem can sometimes be solved relatively easily under certain criteria with the assistance 
of statistical mechanics ideas [10].



For a wide class of NP-complete problems, the following situation happens. A problem 
has a parameter and, when the size of the problem becomes large, there appears a "critical" 
value of the parameter such that below it an algorithm can efficiently find a solution (easy 
region) but above it the same algorithm no longer works effectively (hard region). This 
happens because the definition of NP-completeness is based on worst-case analysis: a
problem is classified as difficult even if only a few of its instances are difficult. The
sudden change of the statistical properties of a problem is in many respects similar to a 
phase transition, a concept from statistical mechanics. In fact, the methods for studying 
phase transitions have turned out to be powerful tools to understand the properties of the 
above-mentioned phenomena. These observations suggest that the typical case study will 
play increasingly important roles in computer science and accordingly the methods from 
statistical mechanics will provide useful tools. 

In typical case studies, one usually considers a randomized version of a problem. In 
other words, our main interest is in the properties of the problem averaged over possible 
realizations of input data. The randomized problems share many features with spin glass 
systems and have been often studied using the techniques of the spin glass theory. In 
particular the replica method has allowed us to analyze the problems, many of which 
would have been impossible to deal with without it. Nonetheless the resulting saddle 
point analysis, known as the problem of replica-symmetry breaking, is often so hard that 
it is usually difficult to get complete understanding of the problem. Hence, to gain more 
insights, it is important to study problems which are solvable without using replicas. 
The number partitioning problem seems to be an ideal example from this point of
view [13, 14]. Suppose that one is given a set of positive integers $A = \{a_1, a_2, \ldots, a_N\}$ and
asked to divide it into two subsets with the same sum. In other words, one
tries to find a subset $A' \subset A$ which minimizes the partition difference

$$ \left| \sum_{a_j \in A'} a_j - \sum_{a_j \in A \setminus A'} a_j \right|. $$

A subset $A'$ with zero partition difference is called a perfect partition, whereas a subset
with a positive partition difference is termed an imperfect partition. It has been argued
that this problem shows a sharp change of states, reminiscent of a phase transition, between
easy and hard regions. In addition, the problem has many practical applications
such as multiprocessor scheduling and minimization of VLSI circuit size.

The analysis in [13, 14] starts from taking the partition difference to be the Hamiltonian.
Then a perfect partition corresponds to a ground state of the Hamiltonian and an imperfect
partition to a configuration with positive energy. By applying the statistical mechanics
methods and a saddle point approximation in the large-$N$ limit, several results have been
obtained without using replicas. The phase transition behaviour of the problem, found
numerically, was understandable through those results. But the expressions obtained
in that analysis show peculiar high-temperature behaviour, as will be shown below. In
particular, the partition function does not give the correct entropy in the limit of high
temperature. Hence those results are not expected to give reliable predictions for imperfect
partitions.

The main purpose of this paper is to propose an alternative approach to the number 
partitioning problem applicable to imperfect partitions as well. We study a generalized 
version of the number partitioning problem: the subset sum. By using some basic
concepts and methods of statistical mechanics, the asymptotic expressions of the number 
of solutions are obtained. Our results specialized to the number partitioning problem 
are compared with the previously obtained predictions. It is shown that our results are 
applicable to the cases where the predictions of the previous analysis do not agree with an 
exactly solvable example. Our discussions are mainly restricted to the easy region although 
the hard region could also be considered by similar arguments using the ideas in [16].

The rest of the paper is organized as follows. In the next section, we introduce the subset 
sum and reformulate it in terms of a Hamiltonian. By using the canonical ensemble, the 
asymptotic number of solutions is estimated in section 3. Based on the results, we discuss 
a crossover between easy and hard regions of the subset sum in section 4. In section 5, the 
analysis is generalized to the case with a constraint. In section 6, we apply the results to
the number partitioning problem and compare them with those in [13, 14]. Conclusions
are given in the last section.



2 Subset Sum 

Let us denote by $\mathbb{N}_+ = \{1, 2, \ldots\}$ the set of positive integers. The subset sum is an example
of NP-complete problems in which one asks, for a given set $A = \{a_1, a_2, \ldots, a_N\}$ with
$a_j \in \mathbb{N}_+$ ($j = 1, 2, \ldots, N$) and $E \in \mathbb{N}_+$, whether or not there exists a subset $A' \subset A$ such
that the sum of the elements of $A'$ is $E$ [1]. To formulate the problem, we introduce a
Hamiltonian (or energy)

$$ H = \sum_{j=1}^{N} a_j n_j, \qquad (2.1) $$

where $n_j \in \{0, 1\}$ ($j = 1, 2, \ldots, N$), and the subset sum is equivalent to asking whether or
not there exists a configuration $\{n_1, n_2, \ldots, n_N\}$ such that $H = E$. A configuration which
satisfies $H = E$ is called a solution in the following.

There are several versions of the problem. The original one is the decision problem; one
only asks whether there exists a solution or not. Once one learns that the answer to the
decision problem is yes, however, it would be quite natural next to ask how many solutions
there are. This is called the counting (or enumeration) version of the problem. On the
other hand, if there is no solution, one might try to find the best possible configuration
which minimizes the energy difference from the given $E$. This is the optimization version
of the problem. Of course these versions are closely related to each other. In the following
treatment, we focus on the counting version of the problem, for which statistical mechanics
provides powerful analytical tools. The number of solutions for a given energy $E$ will be
denoted by $W(E)$.
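
As a point of reference for the counting version, $W(E)$ can be obtained exactly for small instances by a standard dynamic-programming recursion. The sketch below is only an illustration under that assumption; it is not the method of this paper, and the name `count_solutions` is introduced here for convenience.

```python
def count_solutions(a):
    """Return a list W where W[E] is the number of subsets of `a` summing to E."""
    e_max = sum(a)
    w = [0] * (e_max + 1)
    w[0] = 1          # the empty subset has energy 0
    reached = 0       # largest energy reachable with the elements seen so far
    for aj in a:
        # Sweep downwards so that each element is used at most once.
        for e in range(reached, -1, -1):
            if w[e]:
                w[e + aj] += w[e]
        reached += aj
    return w


if __name__ == "__main__":
    # Hypothetical toy instance A = {1, 2, 3}.
    print(count_solutions([1, 2, 3]))
```

The cost grows like $N \cdot E_{\max}$, so this is practical only when the $a_j$ are small; it serves here as a ground truth against which asymptotic formulas can be compared.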

3 Statistical Mechanical Analysis of Subset Sum 

Evaluation of the exact value of $W(E)$ for given $A$ and $E$ is still a question of complicated
combinatorics and is very hard. In particular, fixing the value of $E$ is a very strong
constraint which renders the counting almost intractable. In the terminology of statistical
mechanics, considering the problem with a fixed value of $E$ corresponds to working in the
microcanonical ensemble. For many purposes in practice, however, one is interested in the
asymptotic behaviour for large $N$ and is satisfied with approximate expressions of $W(E)$;
an exact expression of $W(E)$ is unnecessary and can even obscure the essential aspects of
the problem. Experience in statistical mechanics tells us that, in order
to know the asymptotic behaviour of $W(E)$, it is much easier to work in the canonical
ensemble. This is a superposition of the microcanonical ensembles for all possible values
of the energy with the Boltzmann factor $e^{-\beta E}$, where $\beta$ is the inverse temperature. In this
section, the set $A$ is still fixed; statistics over many $A$ is not considered.

Simplicity of the analysis of the subset sum compared with other problems stems from
a compact expression for the partition function. For a given $A$, the partition function $Z$
is simply given by

$$ Z = \sum_{\{n_j\}} e^{-\beta H} = \sum_{n_1=0,1} \sum_{n_2=0,1} \cdots \sum_{n_N=0,1} e^{-\beta \sum_{j=1}^{N} a_j n_j} = (1 + e^{-\beta a_1})(1 + e^{-\beta a_2}) \cdots (1 + e^{-\beta a_N}). \qquad (3.1) $$

From this one can calculate the average values of various physical quantities. The average
here means the thermal average and is denoted by $\langle \cdots \rangle$. The average energy $\langle E \rangle$ for a
given value of $\beta$ is given by

$$ \langle E \rangle = -\frac{\partial}{\partial\beta} \log Z = \sum_{j=1}^{N} \frac{a_j}{1 + e^{\beta a_j}}. \qquad (3.2) $$

Note that the value of $\langle E \rangle$ can be controlled by changing $\beta$. As $\beta$ is increased from $-\infty$
to $\infty$, the average energy $\langle E \rangle$ decreases from $\sum_{j=1}^{N} a_j$ to 0. In usual statistical mechanics,
the temperature and hence $\beta$ should be positive. For our present problem, however, the
temperature is introduced only as a parameter to control the average energy. A negative
value of $\beta$ is also allowed in our problem. The fluctuation of the energy is similarly
calculated as

$$ \langle (E - \langle E \rangle)^2 \rangle = \frac{\partial^2}{\partial\beta^2} \log Z = \sum_{j=1}^{N} \frac{a_j^2}{(1 + e^{\beta a_j})(1 + e^{-\beta a_j})}. \qquad (3.3) $$


Here we go back to (3.1) and observe that $W(E)$, the number of solutions to the
condition $H = E$, appears as the coefficient of the $E$th power of $q$ ($= e^{-\beta}$) in $Z$; the
expansion of $Z$ in terms of $q$ gives

$$ Z = \sum_{E=0}^{E_{\max}} W(E)\, q^{E} \qquad (3.4) $$

with $E_{\max} = a_1 + a_2 + \cdots + a_N$. Moreover, since $Z$ is a polynomial in $q$, (3.4) can be
inverted easily: $W(E)$ has an integral representation

$$ W(E) = \frac{1}{2\pi i} \oint_C \frac{Z}{q^{E+1}}\, dq, \qquad (3.5) $$

with $C$ being a contour enclosing the origin anticlockwise in the complex $q$ plane. It is
important to notice that the contour $C$ in (3.5) can be deformed arbitrarily as long as it
encloses the origin anticlockwise.
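
The contour integral for $W(E)$ can be checked numerically on a small instance. The sketch below assumes a circular contour $q = r e^{i\theta}$ sampled at $E_{\max} + 1$ equally spaced angles; because $Z$ is a polynomial of degree $E_{\max}$ in $q$, this discrete sum reproduces the coefficient exactly (up to rounding) for any radius $r > 0$, which illustrates the freedom to deform the contour. The function names are hypothetical.

```python
import cmath


def partition_function(a, q):
    """Z as a polynomial in q = exp(-beta): product of (1 + q**a_j)."""
    z = 1.0 + 0j
    for aj in a:
        z *= 1 + q ** aj
    return z


def count_by_contour(a, energy, radius=1.0):
    """Discretised contour integral for W(E) on a circle of the given radius."""
    k_points = sum(a) + 1  # more sample points than the degree of Z
    total = 0j
    for k in range(k_points):
        q = radius * cmath.exp(2j * cmath.pi * k / k_points)
        # dq / (2 pi i q) becomes a uniform average over the sample points.
        total += partition_function(a, q) * q ** (-energy)
    return (total / k_points).real


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        print(r, round(count_by_contour([1, 2, 3], 3, radius=r), 6))
```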

Now we consider the asymptotics of $W(E)$ as $N \to \infty$. One should specify how this limit
is taken since changing $N$ also implies changing $A$ simultaneously. To avoid this difficulty,
let us suppose for the moment that one first has an infinite set $A_\infty = \{a_1, a_2, \ldots\}$, each
element of which is taken from a finite set $\{1, 2, \ldots, L\}$, with $L \in \mathbb{N}_+$. Then the set $A$
can be regarded as a collection of the first $N$ elements of $A_\infty$. The limit $N \to \infty$ is defined
without ambiguities in this way.

Since $a_j$ satisfies $1 \le a_j \le L$ for all $j$, a simple estimation of (3.3) shows that the
fluctuation of the energy is of order $N$ when $\beta$ is finite. Hence the fluctuation of the energy
per system size $N$, i.e., $\langle (E/N - \langle E \rangle/N)^2 \rangle$, tends to zero as $N \to \infty$. Let $\beta_0$ be the value
of $\beta$ such that the average energy $\langle E \rangle$ is equal to $E$. Then one takes the contour $C$ in (3.5)
to be a circle with radius $e^{-\beta_0}$ and uses the phase variable $\theta$ defined by $q = e^{-\beta_0 + i\theta}$ to find

$$ W(E) = \frac{1}{2\pi} \int_{-\pi}^{\pi} d\theta\, \exp\left[ \log Z\big|_{\beta = \beta_0 - i\theta} + (\beta_0 - i\theta) E \right]. \qquad (3.6) $$

It is not difficult to verify that the saddle point of this integrand is at $\theta = 0$. (When the
g.c.d. of $A$ is not one, there appear other saddle points with the same order of contributions.
But, when $N$ is large enough, it is almost sure that the g.c.d. of $A$ is one, which we assume
in what follows.)

The quantities in the exponent on the right hand side of (3.6) are all of order $N$.
Therefore we can use the method of steepest descent to evaluate the asymptotic behaviour
of the integral. Expanding $\log Z$ around $\beta = \beta_0$ to second order leads to

$$ W(E) \simeq e^{\log Z|_{\beta=\beta_0} + \beta_0 E}\, \frac{1}{2\pi} \int_{-\pi}^{\pi} d\theta\, \exp\left( -\frac{\theta^2}{2} \frac{\partial^2}{\partial\beta^2} \log Z \Big|_{\beta=\beta_0} \right). \qquad (3.7) $$

Since the second derivative of $\log Z$ is the fluctuation of the energy (3.3) and is of order
$N$, the integration range may be extended to $\pm\infty$. The result is

$$ W(E) \sim \frac{\exp\left[ \log Z|_{\beta=\beta_0} + \beta_0 E \right]}{\sqrt{2\pi \frac{\partial^2}{\partial\beta^2} \log Z|_{\beta=\beta_0}}}. \qquad (3.8) $$

Here and in the following the symbol $\sim$ means that the ratio of the right and left hand sides
tends to unity as $N \to \infty$. After rewriting $\beta_0$ back to $\beta$, we finally obtain the asymptotic
expression of $W(E)$ for a given value of $E$ via a common parameter $\beta$ as



$$ W(E) \sim \frac{\exp\left[ \sum_{j=1}^{N} \log(1 + e^{-\beta a_j}) + \beta \sum_{j=1}^{N} a_j / (1 + e^{\beta a_j}) \right]}{\sqrt{2\pi \sum_{j=1}^{N} a_j^2 / \{(1 + e^{\beta a_j})(1 + e^{-\beta a_j})\}}}, \qquad (3.9) $$

$$ E = \sum_{j=1}^{N} \frac{a_j}{1 + e^{\beta a_j}}. \qquad (3.10) $$

A numerical check of (3.9) and (3.10) is shown in Fig. 1. As far as one sees on this scale,
the agreement of our predictions and simulational data is satisfactory for the entire range
of the energy.

One may notice that the obtained expressions, (3.9) and (3.10), do not depend on $L$.
This is plausible since the number of solutions depends only on the elements of $A$, not
directly on the set from which elements of $A$ have been taken. One should remember in
this connection that the validity of the saddle point analysis depends on $L$. The expressions
(3.9) and (3.10) become better approximations as $N \to \infty$ for a fixed value of $L$. It would
not be surprising if the expressions do not agree very well with simulational data when $L$
is sufficiently large so that $W(E)$ is of $O(1)$. In particular one should not use (3.9) in the
parameter region which gives $W(E) < 1$ as will be discussed below.
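
The parametric pair (3.9) and (3.10) can be turned into a small numerical routine: solve (3.10) for $\beta$ at the requested energy, then evaluate (3.9). The following is a minimal sketch under stated assumptions: bisection on a fixed bracket $[-50, 50]$ is chosen here for simplicity (it relies only on $\langle E \rangle$ being decreasing in $\beta$), and the helper names are hypothetical.

```python
import math


def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def log1p_exp_neg(x):
    """Numerically stable log(1 + exp(-x))."""
    if x > 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


def mean_energy(a, beta):
    """Right hand side of (3.10)."""
    return sum(aj * fermi(beta * aj) for aj in a)


def solve_beta(a, energy, lo=-50.0, hi=50.0, iters=200):
    """Bisection for beta; the bracket [lo, hi] is an assumption of this sketch."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_energy(a, mid) > energy:
            lo = mid   # average energy too large: increase beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymptotic_count(a, energy):
    """Saddle point estimate (3.9) of W(E) for 0 < E < sum(a)."""
    beta = solve_beta(a, energy)
    log_z = sum(log1p_exp_neg(beta * aj) for aj in a)
    variance = sum(aj * aj * fermi(beta * aj) * fermi(-beta * aj) for aj in a)
    return math.exp(log_z + beta * energy) / math.sqrt(2 * math.pi * variance)


if __name__ == "__main__":
    a = [1] * 20  # toy instance with all elements equal to one
    print(asymptotic_count(a, 10), math.comb(20, 10))
```

For the toy instance above the exact count is a binomial coefficient, so the estimate can be compared directly; as stated in the text, the estimate should not be trusted where it predicts $W(E) < 1$.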



4 Easy/Hard Regions



Numerical simulations suggest that there exist easy and hard regions for a randomized
version of the subset sum, in which one considers statistics over many samples of $A$
with each $a_j$ drawn from $\{1, 2, \ldots, L\}$ uniformly. In simulations, one checks if there exists
a solution for many samples of $A$ with given $N$, $L$ and $E$. Then for each $N$ and $E$, one plots
the probability that there is at least one solution as a function of $\kappa = \log_2 L / N$. It is
observed that the probability decreases fairly sharply from 1 to 0 as $\kappa$ increases from zero
to $\infty$. As $N$ becomes larger, the decrease of the probability occurs in a narrower range
of $\kappa$. In fact, the system appears to have a sharp transition at a critical value $\kappa_c$ in the
limit $N, L \to \infty$. In this section, we estimate the critical value $\kappa_c$ of the randomized
subset sum by using the results of the last section.

The analysis in the last section gives us the asymptotic formula for a fixed $A$. To apply
the results to the randomized version of the problem, one has to notice that, as $N$ becomes
large, the sample dependence of (3.9) and (3.10) is increasingly suppressed. In fact, in the
limit $N \to \infty$ with $L$ fixed, there would be no sample dependence so that the average
properties of these quantities coincide with those of a typical sample. To see this, let us
define the density of $y_j = a_j / L$ ($1 \le j \le N$) to be $\rho_N(y) = \frac{1}{N} \sum_{j=1}^{N} \delta(y - y_j)$. Since we
draw the $a_j$ uniformly, we have $\lim \rho_N(y) = \rho(y)$, where $\rho(y) = 1$ for $0 \le y \le 1$ and
$\rho(y) = 0$ otherwise. In addition, in this limit, summations in (3.9) and (3.10) are replaced
by integrals, resulting in

$$ W(E) \sim \frac{\exp\left[ N \int_0^1 dy\, \{ \log(1 + e^{-\alpha y}) + \alpha y / (1 + e^{\alpha y}) \} \right]}{\sqrt{2\pi N L^2 \int_0^1 dy\, y^2 / \{(1 + e^{\alpha y})(1 + e^{-\alpha y})\}}}, \qquad (4.1) $$

$$ x = \int_0^1 dy\, \frac{y}{1 + e^{\alpha y}}, \qquad (4.2) $$

where we have introduced a scaled inverse temperature $\alpha = \beta L$, so that $\beta a_j = \alpha y_j$. The parameter $\alpha$ controls
$x$, the energy divided by $N L$; as $\alpha$ is increased from $-\infty$ to $\infty$, $x$ decreases from 1/2 to
0. The validity of these expressions is determined only by the values of $N$ and $L$. For a
given $N$, they are valid for sufficiently small $L$. Even though $L$ changes, however, these
expressions are expected to be good approximations as long as $N$ is relatively large or when
$\kappa$ is fairly smaller than $\kappa_c$. On the other hand, the reliability of these expressions is unclear
for $\kappa \gtrsim \kappa_c$. In fact, there is evidence that the average minimal cost is not self-averaging in
this region [12]. The value of $\kappa$ below which the above formulas are valid increases as $N$
and $L$ increase, and finally it reaches $\kappa_c$ in the limit $N, L \to \infty$. It is important to notice
that the value of $\kappa_c$ can be determined by the condition $W(E) = 1$ in the limit $N, L \to \infty$
because $W(E)$ is the expectation value of the number of configurations. Equating the exponent
of (4.1) with the leading term $\log L = N \kappa \log 2$ of the logarithm of its denominator, we therefore find

$$ \kappa_c = \frac{1}{\log 2} \int_0^1 dy \left\{ \log(1 + e^{-\alpha y}) + \frac{\alpha y}{1 + e^{\alpha y}} \right\} \qquad (4.3) $$

with $\alpha$ determined by (4.2) for a given value of $x$. For $\kappa < \kappa_c$, exponentially many solutions
are expected to exist and one of them can be found fairly easily. On the other hand, for
$\kappa > \kappa_c$, there is practically no solution and hence it is virtually impossible to find one. The
easy/hard regions of the randomized subset sum are shown in Fig. 2.
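
The boundary between the easy and hard regions can be traced numerically by evaluating (4.2) and (4.3) on a grid of $\alpha$. The sketch below assumes a composite Simpson rule for the $y$ integrals (a choice made here, not taken from the paper) and moderate values of $|\alpha|$; the function names are hypothetical.

```python
import math


def _integrate(f, n=2000):
    """Composite Simpson rule on [0, 1]; n must be even."""
    h = 1.0 / n
    s = f(0.0) + f(1.0)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3.0


def x_of_alpha(alpha):
    """Scaled energy x = E / (N L) from (4.2)."""
    return _integrate(lambda y: y / (1.0 + math.exp(alpha * y)))


def kappa_c(alpha):
    """Critical value of kappa = log2(L) / N from (4.3)."""
    def integrand(y):
        u = alpha * y
        return math.log1p(math.exp(-u)) + u / (1.0 + math.exp(u))
    return _integrate(integrand) / math.log(2)


if __name__ == "__main__":
    # Parametric curve (x, kappa_c) separating the easy and hard regions.
    for alpha in (-8.0, -2.0, 0.0, 2.0, 8.0):
        print(f"alpha={alpha:+.1f}  x={x_of_alpha(alpha):.4f}  kappa_c={kappa_c(alpha):.4f}")
```

At $\alpha = 0$ the integrand of (4.3) is the constant $\log 2$, so $\kappa_c = 1$ at $x = 1/4$, and $\kappa_c$ falls below one as $x$ moves away from $1/4$ in either direction.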

5 Constrained Case



In some applications, one might encounter a situation where the number of $a_j$'s is given. In
this section, our previous analysis is generalized to the constrained case where the number
of chosen $a_j$'s is fixed to $M$. Instead of considering directly the system with the constraint, we
again take a superposition of the problems with various values of $M$. In the language of
statistical mechanics, we work in the grand canonical ensemble. Let us define a Hamiltonian

$$ H = \sum_{j=1}^{N} a_j n_j - \frac{\mu}{\beta} \sum_{j=1}^{N} n_j. \qquad (5.1) $$

The first term is nothing but the Hamiltonian for the unconstrained subset sum. The
second term is introduced to control the number of $a_j$'s by changing the parameter $\mu$, the
chemical potential. The grand partition function is evaluated as



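$$ \Theta = \sum_{\{n_j\}} e^{-\beta H} = (1 + e^{\mu} e^{-\beta a_1})(1 + e^{\mu} e^{-\beta a_2}) \cdots (1 + e^{\mu} e^{-\beta a_N}) = \sum_{M=0}^{N} \sum_{E=0}^{E_{\max}} W(M, E)\, e^{\mu M} e^{-\beta E}, \qquad (5.2) $$

with $E_{\max} = a_1 + a_2 + \cdots + a_N$ as before. Here $W(M, E)$ is the number of configurations
which satisfy $\sum_{j=1}^{N} n_j = M$ and $\sum_{j=1}^{N} a_j n_j = E$ simultaneously.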

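For given values of $\mu$ and $\beta$, the average number, energy and second moments of these
quantities are expressed as

$$ \langle M \rangle = \frac{\partial}{\partial\mu} \log\Theta = \sum_{j=1}^{N} \frac{1}{1 + e^{\beta a_j - \mu}}, \qquad (5.3) $$

$$ \langle E \rangle = -\frac{\partial}{\partial\beta} \log\Theta = \sum_{j=1}^{N} \frac{a_j}{1 + e^{\beta a_j - \mu}}, \qquad (5.4) $$

$$ \langle (M - \langle M \rangle)^2 \rangle = \frac{\partial^2}{\partial\mu^2} \log\Theta = \sum_{j=1}^{N} \frac{1}{(1 + e^{\beta a_j - \mu})(1 + e^{-\beta a_j + \mu})}, \qquad (5.5) $$

$$ \langle (M - \langle M \rangle)(E - \langle E \rangle) \rangle = -\frac{\partial^2}{\partial\mu\,\partial\beta} \log\Theta = \sum_{j=1}^{N} \frac{a_j}{(1 + e^{\beta a_j - \mu})(1 + e^{-\beta a_j + \mu})}, \qquad (5.6) $$

$$ \langle (E - \langle E \rangle)^2 \rangle = \frac{\partial^2}{\partial\beta^2} \log\Theta = \sum_{j=1}^{N} \frac{a_j^2}{(1 + e^{\beta a_j - \mu})(1 + e^{-\beta a_j + \mu})}. \qquad (5.7) $$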

Similarly to the unconstrained case, one can show that the fluctuations of the number and
energy divided by the system size vanish as $N \to \infty$ so that one can apply the saddle point
method. The resulting asymptotic expression for $W(M, E)$ reads

$$ W(M, E) \sim \frac{\exp[\log\Theta + \beta E - \mu M]}{2\pi \sqrt{D}}, \qquad (5.8) $$

with $D$ being

$$ D = \frac{\partial^2}{\partial\mu^2} \log\Theta \cdot \frac{\partial^2}{\partial\beta^2} \log\Theta - \left( \frac{\partial^2}{\partial\mu\,\partial\beta} \log\Theta \right)^2. \qquad (5.9) $$

The values of $M$ and $E$ are given by (5.3) and (5.4), respectively. Using these expressions,
one can discuss the easy/hard regions of the constrained subset sum. The analysis is almost
the same as that in the last section and is omitted here.
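
For small instances the constrained counts $W(M, E)$ can be tabulated exactly, which gives a reference against which an asymptotic estimate may be tested. The sketch below assumes a plain dynamic-programming expansion of the grand partition function as a polynomial in two variables; it is an illustration written for this purpose, and the name `count_constrained` is hypothetical.

```python
def count_constrained(a):
    """Return w with w[m][e] = number of subsets of `a` with m elements and sum e."""
    n, e_max = len(a), sum(a)
    w = [[0] * (e_max + 1) for _ in range(n + 1)]
    w[0][0] = 1  # empty subset: M = 0, E = 0
    for used, aj in enumerate(a):
        # Go through m downwards so that the current element is added at most once.
        for m in range(used, -1, -1):
            for e in range(e_max - aj, -1, -1):
                if w[m][e]:
                    w[m + 1][e + aj] += w[m][e]
    return w


if __name__ == "__main__":
    table = count_constrained([1, 2, 3])
    for m, row in enumerate(table):
        print(m, row)
```

Summing the table over $M$ at fixed $E$ recovers the unconstrained count $W(E)$, which is the statement that the grand canonical ensemble is a superposition of the problems with different $M$.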



6 Number Partitioning Problem 



As already mentioned in the introduction, the subset sum is regarded as a generalization 
of the number partitioning problem. In this section, we apply our previous discussions to 
the number partitioning problem. Our results are compared with those in [13, 14], which
are briefly reviewed in Appendix A with some remarks. 

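Let us first establish an explicit relationship between the subset sum and the number
partitioning problem. If one introduces the spin variables by

$$ s_j = 2 n_j - 1, \qquad (6.1) $$

it is not difficult to see

$$ \tilde{H} := \left| 2H - \sum_{j=1}^{N} a_j \right| = \left| \sum_{j=1}^{N} a_j s_j \right|. \qquad (6.2) $$

This is exactly the Hamiltonian of the number partitioning problem studied in [13, 14].
The number of solutions of $\tilde{H} = \tilde{E}$, which we denote by $\tilde{W}(\tilde{E})$, is related to $W(E)$ by

$$ \tilde{W}(\tilde{E}) = \begin{cases} W\left( \frac{1}{2} \sum_{j=1}^{N} a_j \right) & (\tilde{E} = 0) \\ W\left( \frac{1}{2} \tilde{E} + \frac{1}{2} \sum_{j=1}^{N} a_j \right) + W\left( -\frac{1}{2} \tilde{E} + \frac{1}{2} \sum_{j=1}^{N} a_j \right) & (\tilde{E} > 0) \end{cases} \qquad (6.3) $$

Of special interest is the case of $\tilde{E} = 0$, a solution of which is called a perfect solution
in the number partitioning problem. In the subset sum, this corresponds to the energy
$E = \frac{1}{2} \sum_{j=1}^{N} a_j$. Clearly there is no perfect solution if $\sum_{j=1}^{N} a_j$ is odd; we assume $\sum_{j=1}^{N} a_j$
is even in the following. In terms of $\beta$, considering perfect solutions corresponds to $\beta = 0$
from (3.10). Setting $\beta = 0$ in (3.9) leads to

$$ \tilde{W}(0) \sim \frac{2^N}{\sqrt{\frac{\pi}{2} \sum_{j=1}^{N} a_j^2}}, \qquad (6.4) $$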







which is expected to be the number of perfect solutions to the number partitioning problem.
In fact (6.4) agrees with the previously obtained result in [13, 14]. Hence, as far as the
number of perfect solutions for the number partitioning problem is concerned, our method
gives exactly the same answer as in [13, 14].

The difference between our formula and that in [13, 14] becomes manifest for a finite
value of $\tilde{E}$. We demonstrate this by considering the number partitioning problem for a
special case where $a_1 = a_2 = \cdots = a_N = 1$ with $N$ even. In this case, the Hamiltonian reads
$\tilde{H} = |\sum_{j=1}^{N} s_j|$. It is possible to write down the partition function $\tilde{Z} = \sum_{\{s_j\}} e^{-\beta \tilde{H}}$
explicitly:

$$ \tilde{Z} = \sum_{k=0}^{N} \binom{N}{k} e^{-\beta |2k - N|} = \binom{N}{N/2} + 2 \sum_{j=1}^{N/2} \binom{N}{N/2 + j} e^{-2\beta j}. \qquad (6.5) $$



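This formula indicates that there are solutions for even $\tilde{E}$ and that $\tilde{W}(\tilde{E})$ is

$$ \tilde{W}(\tilde{E}) = \begin{cases} \binom{N}{N/2} \sim \dfrac{2^N}{\sqrt{\pi N / 2}} & (\tilde{E} = 0) \\ 2 \binom{N}{N/2 + j} \sim \dfrac{2 \exp N \left[ -(\frac{1}{2} - x) \log(\frac{1}{2} - x) - (\frac{1}{2} + x) \log(\frac{1}{2} + x) \right]}{\sqrt{2\pi N (\frac{1}{2} + x)(\frac{1}{2} - x)}} & (\tilde{E} = 2j) \end{cases} \qquad (6.6) $$

where $x = j/N$ and the asymptotics are also indicated. For the present case with $a_j = 1$ ($1 \le j \le N$),
(3.10) is simply inverted as $\beta = \log(N/E - 1)$. Then, using Stirling's formula, one can
confirm that our formula (3.9) gives correct asymptotics in (6.6) for the entire range of
energy. By contrast the partition function in [13, 14], which we denote by $Z'$, for the present
case can be written as

$$ Z' = 2^N \sqrt{\frac{2}{\pi N}} \coth\beta = 2^N \sqrt{\frac{2}{\pi N}} \left( 1 + 2 \sum_{j=1}^{\infty} e^{-2\beta j} \right). \qquad (6.7) $$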



This indicates that there are solutions for even $\tilde{E}$ and that $\tilde{W}(\tilde{E})$ is asymptotically $2^N \sqrt{2/(\pi N)}$
for $\tilde{E} = 0$ and $2^{N+1} \sqrt{2/(\pi N)}$ for $\tilde{E} = 2j$. As is clear from (6.6), the correct asymptotics is
predicted only for $\tilde{E} = 0$. One may notice that the arguments in [13, 14] are somewhat
different from ours. There, the energy $E$ and the entropy $S$ are calculated from the partition
function (6.7), following the usual prescriptions of statistical mechanics. It is assumed
that $\exp(S)$ gives the number of solutions. The obtained expressions again do not give the
correct asymptotics of $\tilde{W}(\tilde{E})$ when $\tilde{E} > 0$. Our conclusion is that the results of [13, 14]
give the correct asymptotic value for $\tilde{E} = 0$ but not for $\tilde{E} > 0$.
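
The comparison for the special case $a_1 = \cdots = a_N = 1$ is easy to reproduce numerically. The sketch below assumes $x = j/N$ in the Stirling form of (6.6) and reads the energy-independent counts off the expansion of $Z'$ in powers of $e^{-2\beta}$; the function names and the values $N = 100$, $j = 20$ are illustrative choices made here.

```python
import math


def exact_count(n, j):
    """Exact number of spin configurations with |sum of s_j| = 2j (n even)."""
    c = math.comb(n, n // 2 + j)
    return c if j == 0 else 2 * c


def stirling_count(n, j):
    """Stirling asymptotics of the exact count, with x = j / n and 0 <= j < n / 2."""
    x = j / n
    entropy = -(0.5 - x) * math.log(0.5 - x) - (0.5 + x) * math.log(0.5 + x)
    value = math.exp(n * entropy) / math.sqrt(2 * math.pi * n * (0.5 + x) * (0.5 - x))
    return value if j == 0 else 2 * value


def flat_count(n, j):
    """Energy-independent count implied by Z' = 2^N sqrt(2 / (pi N)) coth(beta)."""
    base = 2 ** n * math.sqrt(2 / (math.pi * n))
    return base if j == 0 else 2 * base


if __name__ == "__main__":
    n = 100
    for j in (0, 5, 20):
        print(j, exact_count(n, j), stirling_count(n, j), flat_count(n, j))
```

At $j = 0$ the two approximations coincide, while for $j > 0$ the energy-independent value overestimates the exact count by a factor that grows rapidly with $j$.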

The reason for this difficulty is traced back to the application of statistical mechanics
techniques to a system for which the number of solutions of $\tilde{H} = \tilde{E}$ decreases as $\tilde{E}$
increases. It can be seen from Fig. 1 that the number partitioning problem is indeed an
example with this anomalous property if one notes that $\tilde{E} = 0$ corresponds to the peak
of the curve. The problem is that, for such systems, the usual prescriptions of the canonical
ensemble do not work. For normal physical systems, the number of states $W(E)$ increases
as a function of the energy $E$. The number of states multiplied by the Boltzmann factor,
$W(E) e^{-\beta E}$, takes a maximum at some value of $E$. The peak around this point becomes
drastically sharper as the system size increases. Then the equivalence of microcanonical
and canonical ensembles holds so that we can study the thermodynamic behaviour of the
system in either ensemble. For systems with decreasing $W(E)$, however, $W(E) e^{-\beta E}$ is a
monotone decreasing function. The fluctuation of the energy does not tend to zero even
when the system size increases indefinitely, and consequently one cannot control the energy
by changing the temperature. As a result, the equivalence of ensembles does not hold. The
exponential of the entropy calculated in the canonical ensemble and the coefficient of the
expansion of $Z$ in powers of $e^{-\beta}$ do not agree even in the thermodynamic limit; in addition,
neither of these quantities gives the correct asymptotics of the number of configurations
$W(E)$. This problem may be overcome by considering a negative temperature as we did
for the subset sum, but a direct analysis of the number partitioning problem described by
(6.2) would then be much more difficult.

Before closing this section, we discuss the randomized number partitioning problem
with a constraint in which one asks whether or not there exists a perfect solution with
$\sum_{j=1}^{N} s_j$ fixed. In the language of subset sum, this corresponds to fixing $M$ since $\sum_{j=1}^{N} s_j =
\sum_{j=1}^{N} (2 n_j - 1) = 2M - N$. It has been found previously that there is a phase transition in
the limit $M, N \to \infty$ with $m = 2M/N - 1$ fixed. We can reproduce this phenomenon from
the results of section 5. In the limit $N$, $L$ and $M \to \infty$, summations in (5.3) and (5.4) are
replaced by integrals. They are written as, for a uniform distribution,

$$ \frac{1}{2}(1 + m) = \int_0^1 \frac{dy}{1 + e^{\alpha y - \mu}} = 1 - \frac{1}{\alpha} \log \frac{e^{\alpha} + e^{\mu}}{1 + e^{\mu}}, \qquad (6.8) $$

$$ x = \int_0^1 dy\, \frac{y}{1 + e^{\alpha y - \mu}}, \qquad (6.9) $$

where $\alpha = \beta L$ and $x = E/(N L)$ as before. Since (6.8) is easily inverted as

$$ e^{\mu} = \frac{e^{\alpha} - e^{\alpha(1-m)/2}}{e^{\alpha(1-m)/2} - 1}, \qquad (6.10) $$

one can regard $x$ as a function of $\alpha$ and $m$. Then, for a fixed $m$ ($-1 < m < 1$), one sees
that $x$ decreases from $(1 + m)(3 - m)/8$ to $(1 + m)^2/8$ as $\alpha$ is increased from $-\infty$ to $\infty$.
There are few or no configurations for energy corresponding to $x$ outside this range. If
we note that the perfect solution corresponds to $x = 1/4$, we find that there are exponentially
many perfect solutions when $|m| < \sqrt{2} - 1$ and there is practically no perfect solution
when $|m| > \sqrt{2} - 1$. Hence we conclude that there is a phase transition at $|m| = \sqrt{2} - 1$,
in agreement with the previous analysis.
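
The location of the transition follows from elementary arithmetic on the two end points of the allowed range of $x$, and can be checked in a few lines. The sketch below simply encodes the interval $((1+m)^2/8,\ (1+m)(3-m)/8)$ and tests whether $x = 1/4$ lies inside it; the function names are hypothetical.

```python
import math


def x_range(m):
    """End points of the scaled energy x reachable at fixed m, for -1 < m < 1."""
    x_min = (1 + m) ** 2 / 8          # alpha -> +infinity: the smallest elements are chosen
    x_max = (1 + m) * (3 - m) / 8     # alpha -> -infinity: the largest elements are chosen
    return x_min, x_max


def has_perfect_solutions(m):
    """True when the perfect-solution energy x = 1/4 lies inside the reachable range."""
    x_min, x_max = x_range(m)
    return x_min < 0.25 < x_max


M_CRIT = math.sqrt(2) - 1  # about 0.4142

if __name__ == "__main__":
    for m in (-0.5, -0.41, 0.0, 0.41, 0.5):
        print(m, x_range(m), has_perfect_solutions(m))
```

The condition $(1+m)^2/8 < 1/4$ gives $m < \sqrt{2} - 1$ and the condition $(1+m)(3-m)/8 > 1/4$ gives $m > 1 - \sqrt{2}$, which together reproduce $|m| < \sqrt{2} - 1$.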

7 Conclusion



We have studied the statistical properties of the subset sum, which is a generalization of
the number partitioning problem. The basic ideas and methods of statistical mechanics
enabled us to study the asymptotic behaviour of the number of solutions for a given set
of input data. The expressions (3.9) and (3.10) represent the main results of this paper.
The agreement of the predictions with simulational data has been found satisfactory.
Our results have been compared with those which were previously obtained by a different
method. They agreed with each other for the number of perfect solutions of the number
partitioning problem. On the other hand, in the case of the subset sum, only our analysis
gave the correct asymptotics over the entire range of energy. The reason why the validity of
the results in [13, 14] is restricted to perfect solutions has been argued to be that the entropy
calculated in the canonical ensemble does not necessarily give the logarithm of the number of
configurations for systems with a decreasing number of states as the energy increases. In
such anomalous systems, one should be extremely careful in using the equivalence between
microcanonical, canonical, and grand canonical ensembles.

Appendix A Some Remarks on the Analysis in [13, 14]

In this Appendix, we briefly review the results in [13, 14] and give a few remarks on the
analysis. Our notation is slightly different from the original for consistency with
the main text. The g.c.d. of $A$ is assumed to be unity in the following. After some
manipulations, the partition function $Z$ for the Hamiltonian (6.2) is rewritten as

$$ Z = \sum_{\{s_j\}} e^{-\beta \tilde{H}} = 2^N \int_{-\pi/2}^{\pi/2} \frac{dy}{\pi}\, e^{N G(y)}, \qquad (A.1) $$

where $\beta$ ($> 0$) is the inverse temperature and

$$ G(y) = \frac{1}{N} \sum_{j=1}^{N} \log \cos(\beta a_j \tan y). \qquad (A.2) $$

Then it has been argued that, for a large $N$, the integral in (A.1) can be evaluated using
the Laplace method. There exist an infinite number of points which give the maximum
value of $\mathrm{Re}\{G(y)\}$. The main contributions are expected to come from the points

$$ y_k = \arctan\left( \frac{\pi k}{\beta} \right), \qquad k = 0, \pm 1, \pm 2, \ldots. \qquad (A.3) $$

It is not difficult to confirm $\mathrm{Re}\{G(y)\} \le 0$ for general $y$ and $\mathrm{Re}\{G(y_k)\} = G'(y_k) = 0$
and $G''(y_k) < 0$, so that the $y_k$'s of (A.3) certainly give the maximum of $\mathrm{Re}\{G(y)\}$. The
contributions from these $y_k$'s can be summed up explicitly, and the result is

$$ Z \simeq 2^N \sqrt{\frac{2}{\pi \sum_{j=1}^{N} a_j^2}} \sum_{k=-\infty}^{\infty} (-1)^{kA} \frac{\beta}{\beta^2 + \pi^2 k^2} = 2^N \sqrt{\frac{2}{\pi \sum_{j=1}^{N} a_j^2}} \times \begin{cases} \coth\beta & (A: \text{even}) \\ \mathrm{cosech}\,\beta & (A: \text{odd}) \end{cases} \qquad (A.4) $$

where $A = \sum_{j=1}^{N} a_j$. Here we remark that in [13, 14] the formula for the case of odd
$A$ is missing. (Also in the arguments of the constrained case, the results are
presented only for the case of even $N$, but one has to consider the case of odd $N$ separately.)
Nevertheless, we only consider the case where $A$ is even in the following, because the odd
$A$ case can be discussed similarly.

One notes that the expression (A.4) diverges as $\beta \to 0$ while the correct limiting value
is clearly $2^N$ from the definition (A.1). Hence, as mentioned in the main text, the result
(A.4) is not valid at least for small $\beta$ (or large $T$).
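
The high-temperature failure can be made concrete for the exactly solvable case $a_1 = \cdots = a_N = 1$ with $N$ even, where the exact partition function is a finite sum of binomial coefficients. The sketch below compares that sum with the $\coth\beta$ form of (A.4) specialized to $\sum_j a_j^2 = N$; the function names and the choice $N = 20$ are assumptions made for illustration.

```python
import math


def exact_z(n, beta):
    """Exact partition function for a_j = 1 (n even): sum over |sum s_j| = 2j."""
    z = float(math.comb(n, n // 2))
    for j in range(1, n // 2 + 1):
        z += 2 * math.comb(n, n // 2 + j) * math.exp(-2 * beta * j)
    return z


def approx_z(n, beta):
    """Laplace-method result (A.4) with a_j = 1 and A = n even; requires beta > 0."""
    return 2 ** n * math.sqrt(2 / (math.pi * n)) / math.tanh(beta)


if __name__ == "__main__":
    n = 20
    for beta in (0.001, 0.1, 1.0, 10.0):
        print(beta, exact_z(n, beta), approx_z(n, beta))
```

At large $\beta$ the two expressions agree to within the accuracy of Stirling's formula, whereas at small $\beta$ the exact value saturates at $2^N$ while the approximation grows like $1/\beta$.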

In [13, 14] the hard region was also discussed using (A.4). In particular, the average
minimum cost was estimated. One should use the finite-temperature expression of the
partition function (A.4) to analyze the non-vanishing value of the average minimum cost in
the hard region. However, since (A.4) is not reliable for large $T$, the formulas given
in [13, 14] should be taken with special caution. The principal source of trouble is in the
anomalous properties of systems in which the number of configurations decreases as the
energy increases. There is another problem in that analysis, as we discuss in the
following.

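A sign of difficulty is seen from the negative value of the entropy in the hard region.
The entropy calculated from the partition function (A.4) reads

$$ S = \log\left[ 2^N \sqrt{\frac{2}{\pi \sum_{j=1}^{N} a_j^2}}\, \coth\beta \right] + \frac{\beta}{\sinh\beta \cdot \cosh\beta}. \qquad (A.5) $$

The ground state entropy $S_0 = \lim_{\beta \to \infty} S$ is found to be

$$ S_0 = \{N - N_c(A)\} \log 2, \qquad (A.6) $$

with

$$ N_c(A) = \frac{1}{2} \log_2\left( \frac{\pi}{2} \sum_{j=1}^{N} a_j^2 \right). \qquad (A.7) $$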

This is equivalent to (6.4). One notices that, when $N < N_c(A)$, the ground state entropy
is negative. In [13, 14], the easy (resp. hard) region is characterized by a positive $S_0$ (resp.
a negative $S_0$), i.e., by $N > N_c(A)$ (resp. $N < N_c(A)$). In an appropriate limit, this
coincides with $\kappa < \kappa_c$ (resp. $\kappa > \kappa_c$). To avoid the difficulty of negative entropy in the
hard region, the author of [13, 14] proposed not to take the $\beta \to \infty$ limit but to use (A.4)
only down to the temperature where $S > \log 2$. This is an arbitrary procedure which would
not be necessary if we used the exact expression of the entropy.

To identify the problem within that formalism, let us remember that (A.4) was obtained
by summing up only the contributions from around the extreme points $\{y_k\}$ of (A.3). In
the easy region ($N \gg N_c(A)$) the peaks around these points are very sharp and hence
the Laplace method gives a good approximation. On the other hand, in the hard region
($N \ll N_c(A)$), there appear a large number of other local maxima, with values not so far
from zero and at points located fairly close to the points $\{y_k\}$ of (A.3). One will be easily
convinced that this happens by checking a very simple example of $N = 2$. In this case
$G(y)$ reads

$$ G(y) = \frac{1}{2} \log\cos(\beta a_1 \tan y) + \frac{1}{2} \log\cos(\beta a_2 \tan y), \qquad (A.8) $$

where $a_1, a_2$ ($a_1 < a_2$) are coprime natural numbers with $a_2$ sufficiently large corresponding
to the hard region.

Figure Captions



Fig. 1: The number of solutions $W(E)$ as a function of the energy $E$ for an example with
$N = 20$, $L = 256$, and $A = \{218, 13, 227, 193, 70, 134, 89, 198, 205, 147, 227, 190, 27, 239, 192, 131\}$.
The theoretical prediction is indistinguishable from the numerical results plotted in dots.



Fig. 2: The easy/hard regions of the randomized subset sum. 



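[Figure 1: plot of $W(E)$ against $E$; horizontal axis ticks at 1000, 2000, 3000, 4000, 5000.]

[Figure 2: easy/hard regions of the randomized subset sum; axis label $x$.]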



